Abstract: We study low-delay error correction codes for
streaming recovery over a class of packet-erasure channels that 
introduce both burst-erasures and isolated erasures. We propose 
a simple, yet effective class of codes whose parameters can be 
tuned to obtain a tradeoff between the capability to correct burst 
and isolated erasures. Our construction generalizes previously 
proposed low-delay codes which are effective only against burst 
erasures. 

We establish an information theoretic upper bound on the 
capability of any code to simultaneously correct burst and 
isolated erasures and show that our proposed constructions meet 
the upper bound in some special cases. We discuss the operational 
significance of column-distance and column-span metrics and 
establish that the rate 1/2 codes discovered by Martinian and 
Sundberg [IT Trans. 2004] through a computer search indeed 
attain the optimal column-distance and column-span tradeoff. 

Numerical simulations over a Gilbert-Elliott channel model 
and a Fritchman model show significant performance gains over 
previously proposed low-delay codes and random linear codes 
for a certain range of channel parameters.

I. Introduction 

Emerging applications such as interactive video conferenc- 
ing, voice over IP and cloud computing are required to achieve 
an end-to-end latency of less than 200 ms. The round-trip time 
in traditional networks can alone approach this limit. Hence 
it is necessary to develop new delay-optimized networking 
protocols and delay-sensitive coding techniques in order to 
meet such stringent delay constraints. In this paper we focus 
on low-delay error correction codes for streaming data at the 
application layer. Commonly used error correction codes op- 
erate on message blocks. To apply them to streaming data, we 
need to either buffer data packets at the encoder or accumulate 
all packets at the decoder before any recovery is possible. To 
reduce delay we need to keep the codeword lengths short, 
which in turn reduces the error correction capability. 

The fundamental limits of delay-constrained communication 
are very different from the classical Shannon capacity. It is 
well known for example that the Shannon capacity of an 
erasure channel only depends on the fraction of the packets 
lost over the channel. However when delay constraints are 
imposed, the pattern of packet losses becomes significant. As 
a toy example, consider two different communication channels 
as shown in Fig. 1 with different loss patterns. The first channel
introduces up to two erasures in any sliding window of length
four. The second channel can erase up to four packets in a
burst, but any burst must be followed by a guard interval of
at least four non-erased packets. Clearly both channel models
have a loss rate of 50%. However the decoding deadlines that
can be realized over these channels can be very different. For
the first channel, we can use a short (4, 2) erasure-correction
code and recover each source packet with a deadline of T = 4
time units. For the second channel we need to use an (8, 4)
erasure correction code and this yields a deadline of T = 8
time units.

Fig. 1. Two packet erasure channels, Channel Model (a) and Channel Model (b),
with a different loss structure. The first
channel has no more than two erasures in a sliding window of length four
whereas the second channel can have up to four erasures in a single burst
followed by a guard spacing of at least four non-erased packets. The shaded
packets are erased symbols. A similar example also appears in

Surprisingly it turns out that the decoding delay on the
second channel can be reduced to T = 5 by using a rate
1/2 delay-optimal code for the burst-erasure channel proposed
in [1]-[3]. Unlike traditional codes, these constructions recognize
the different recovery deadlines of streaming data, and
do not wait to recover all the erased packets simultaneously.
Instead they exploit the burst structure of the channel to enable
selective recovery of earlier data. In particular, following the
erasure burst spanning t ∈ [1, 4] the code recovers only the data
packet s[1] at time t = 5, the data packet s[2] at time t = 6,
etc. Such low-delay constructions exist for any burst-erasure
channel with a maximum burst-length and a given delay. We 
will refer to these constructions as streaming codes (SCo) in 
this paper and the associated feature of recovering successive 
source packets in a sequential manner as streaming recovery. 

One weakness of the SCo codes [1]-[3] is that their performance
is sensitive to isolated packet losses. As reported
in our simulations over a Gilbert-Elliott channel model, the
error-correction capability of the code deteriorates significantly 



when we introduce just a small loss probability in the good 
state. Motivated by this observation, we study low-delay 
error correction codes for a class of channels that introduce 
both burst erasures and isolated erasures. Fig. 2 provides an
example of such a channel. In any sliding window of a given
length W, the channel can introduce either a certain number of
erasures in arbitrary locations or an erasure burst of a certain
maximum length. As we observe in simulations, low-delay
codes for such channels also perform well over Gilbert-Elliott
channels and other related channels.

One simple construction for such channels is based on 
concatenation of two different codes. We generate one set of 
parity checks from a standard erasure code and another set 
from the SCo code and then concatenate the two parity checks 
in the transmitted packet. The former parity checks can be used 
when the window of interest has isolated erasures whereas the 
latter parity checks can be used when it has burst-erasures. 
Unfortunately such an approach can introduce a significant 
overhead and is not desirable. 

From a code design viewpoint, codes with large column
distance can correct a large number of isolated erasures, whereas
codes with large column span can correct large bursts. Thus we
seek codes with large column distance (d_T) and column span
(c_T) for channels with both burst and isolated losses. Naturally
there exists a tradeoff between these parameters. We establish,
to our knowledge, the first information theoretic outer bound
on the achievable (d_T, c_T) for any code of a given rate. This
bound enables us to verify that some of the code constructions
reported using a computer search in [3] are indeed optimal.

Our proposed construction divides each source packet s[i]
into two groups of sub-packets, say s_A[i] and s_B[i]. It generates
separate parity checks p_A[·] and p_B[·] for each group and
combines the parity checks as p_A[t] + p_B[t − Δ] after a suitable
time-shift of Δ. By increasing the shift Δ we trade off the
column distance for a larger column span. Our construction is
optimal for R = 1/2. Codes with either a maximum value of
d_T or c_T also appear as special cases in this construction.

One practical appeal of our constructions is the ability to
trade off the correction of burst and isolated losses
using a simple mechanism. This means the same encoder and
decoder can work over channels with different mixes
of burst and isolated losses by simply adjusting the shift Δ.
Furthermore, this trade-off can be adjusted mid-session if the
application identifies a change in prevalent network conditions.
Since only a single parameter is involved, it incurs negligible
overhead to send Δ in each packet so that trade-offs can be
made without explicit signalling that could add delay.

We point the reader to [4]-[17] for additional works on error
control mechanisms for streaming.

II. System Model 

We study low-delay error correction codes for a particular
channel model with the following property. Take any sliding
window of length W. The channel can introduce either a single
erasure burst of length B or a maximum of N erasures in arbitrary
locations, but no other erasure pattern. We will generally
assume that N < B since the set of arbitrary erasures includes
the burst-erasure pattern as a special case. An example of such
a channel with W = 5, B = 3 and N = 2 is provided in
Fig. 2.

Fig. 2. A channel model with a mixture of burst erasures and isolated
erasures. In any sliding window of length W = 5 there is either a single
erasure burst of length B = 3 or up to N = 2 erasures.
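The channel constraint above can be checked mechanically for a given erasure pattern. The following is a minimal sketch, not part of the original paper: the function name `is_admissible` and the 0/1 encoding of erasures are assumptions made for illustration, and a burst is allowed to have any length up to B.

```python
def is_admissible(pattern, W, B, N):
    """Check the sliding-window channel constraint.

    `pattern` is a sequence of 0/1 flags, where 1 marks an erased packet.
    Every window of length W must contain either at most N erasures in
    arbitrary positions, or a single burst of at most B consecutive erasures.
    """
    for start in range(max(1, len(pattern) - W + 1)):
        window = pattern[start:start + W]
        erased = [i for i, e in enumerate(window) if e]
        if len(erased) <= N:
            continue  # few enough erasures, positions do not matter
        # More than N erasures: they must form one contiguous burst of length <= B.
        is_single_burst = erased[-1] - erased[0] + 1 == len(erased)
        if not (is_single_burst and len(erased) <= B):
            return False
    return True


# Parameters of the example channel: W = 5, B = 3, N = 2.
print(is_admissible([1, 1, 1, 0, 0, 0, 0, 0], W=5, B=3, N=2))  # burst of length 3
print(is_admissible([1, 0, 1, 0, 1, 0, 0, 0], W=5, B=3, N=2))  # three isolated erasures
```

With W = 5, B = 3 and N = 2, the first pattern is a permissible burst while the second places three isolated erasures in one window and is rejected.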

We assume a deterministic source arrival process. At time
i ≥ 0, the encoder is revealed a source packet s[i] which we
assume is a symbol from a source alphabet S. At time i the
encoder generates a channel symbol x[i] which belongs to a
channel input alphabet X. The channel symbol is a causal
function of the source symbols, i.e.

    x[i] = f_i(s[0], \ldots, s[i]), \quad i \ge 0.   (1)

The channel output is given by y[i] = x[i] when
the packet is not erased and by y[i] = * when the packet is
erased. Given the channel output, the decoder is required to
reconstruct each packet with a delay of T units (footnote 1), i.e.

    s[i] = g_i(y[0], \ldots, y[i + T]).   (2)

Remark 1: In contrast to an (n, k) block code, where k information
symbols are mapped to n codeword symbols, the
proposed setup maps a stream of incoming source packets
over an alphabet S to a stream of channel packets over the
alphabet X. To add redundancy we require that |X| > |S|.

A rate R = log|S| / log|X| is achievable if there exists a feasible code
that recovers every erased symbol s[i] by time i + T from any
permissible channel, i.e., the channel introduces no more than
N arbitrary erasures or a single erasure burst of length up to
B in any sliding window of length W.

For the rest of the paper, we set W = T + 1 as the analysis is
most convenient for this choice. The interplay between delay
and the channel dynamics also appears most interesting in this
regime. For T ≫ W the delay constraint is not particularly
active, while for T ≪ W the guard separation between packet
losses can be generally large.

III. Distance and Span Metrics 

Let F_q denote a finite field of size q. For convenience we let
S = F_q^k and X = F_q^n. We view the input symbol s[i] = s_i as a
length-k vector over F_q and x[i] = x_i as a length-n vector over
F_q. We restrict our attention to time-invariant linear (n, k, m)
convolutional codes specified by

    x_i = \sum_{j=0}^{m} s_{i-j} G_j,

where G_0, ..., G_m are k × n generator matrices over F_q.

The first T + 1 output symbols can be expressed as

    [x_0, x_1, \ldots, x_T] = [s_0, s_1, \ldots, s_T] \cdot G_T,   (3)

where

    G_T = \begin{bmatrix} G_0 & G_1 & \cdots & G_T \\ 0 & G_0 & \cdots & G_{T-1} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & G_0 \end{bmatrix}   (4)

is the truncated generator matrix to the first T + 1 columns.
Note that G_j = 0 if j > m. For the low-delay property the
minimum distance and span properties of G_T are important as
discussed below. Such a connection was discussed in [3] and
used to perform a computer search of good low-delay codes.

Footnote 1: Notice that the total number of channel packets involving s[i] before its
recovery is T + 1.

Definition 1 (Column Distance): The column distance of
G_T is defined as

    d_T = \min_{s = [s_0, s_1, \ldots, s_T],\; s_0 \neq 0} \mathrm{wt}(s \cdot G_T),

where wt(x) equals the Hamming weight of the vector x.

We refer the reader to [18, Chapter 3] for some properties
of d_T.

Fact 1: A convolutional code with a column distance of
d_T can recover every information symbol with a delay of T
provided the channel introduces no more than N = d_T − 1
erasures in any sliding window of length T + 1. Conversely
there exists at least one erasure pattern with d_T erasures in a
window of length T + 1 where the decoder fails to recover all
source packets.

To the best of our knowledge the column span of a convolutional
code was first introduced in [3] in the context of
low-delay codes for burst erasure channels.

Definition 2 (Column Span): The column span of G_T is
defined as

    c_T = \min_{s = [s_0, s_1, \ldots, s_T],\; s_0 \neq 0} \mathrm{span}(s \cdot G_T),

where span(x) computes the length of the support of the
vector x, i.e., span(x) = j − i + 1, where j is the last index
where x is non-zero and i is the first such index.
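For very small codes both metrics can be evaluated by exhaustive search. The sketch below is illustrative only and rests on two assumptions that are not spelled out in the text: q is prime (so arithmetic is modulo q), and weight and span are counted over channel packets x_0, ..., x_T rather than over individual field elements, which matches the packet-erasure interpretation of Facts 1 and 2. The function names and the toy code are hypothetical.

```python
import itertools


def encode_prefix(s, G, T, q):
    """Return [x_0, ..., x_T] with x_i = sum_j s_{i-j} G_j over GF(q), q prime.

    This equals the product of [s_0, ..., s_T] with the truncated generator
    matrix, using G_j = 0 for j > m and s_t = 0 for t < 0.
    """
    k, n = len(G[0]), len(G[0][0])
    out = []
    for i in range(T + 1):
        x = [0] * n
        for j in range(min(i, len(G) - 1) + 1):
            for a in range(k):
                for b in range(n):
                    x[b] = (x[b] + s[i - j][a] * G[j][a][b]) % q
        out.append(x)
    return out


def column_distance_and_span(G, T, q=2):
    """Brute-force (d_T, c_T) over all inputs with s_0 != 0 (small cases only)."""
    k = len(G[0])
    best_wt = best_span = None
    for flat in itertools.product(range(q), repeat=k * (T + 1)):
        s = [flat[t * k:(t + 1) * k] for t in range(T + 1)]
        if not any(s[0]):
            continue  # both definitions require s_0 != 0
        x = encode_prefix(s, G, T, q)
        support = [t for t, xt in enumerate(x) if any(xt)]  # non-zero packets
        wt = len(support)
        span = support[-1] - support[0] + 1 if support else 0
        best_wt = wt if best_wt is None else min(best_wt, wt)
        best_span = span if best_span is None else min(best_span, span)
    return best_wt, best_span


# Hypothetical toy code over GF(2): x_i = (s_i, s_{i-3}), i.e. rate 1/2
# repetition with a delay of 3 (G_0 = [1 0], G_1 = G_2 = 0, G_3 = [0 1]).
G = [[[1, 0]], [[0, 0]], [[0, 0]], [[0, 1]]]
print(column_distance_and_span(G, T=3))
```

For this toy code every input with s_0 ≠ 0 produces non-zero packets at times 0 and 3, so the search returns a column distance of 2 and a column span of 4 = T + 1. The search is exponential in k(T + 1) and is only meant for sanity checks.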

Fact 2: A necessary and sufficient condition for a convolutional
code to recover every erased symbol with a delay of T
from a channel that introduces no more than a single erasure
burst of maximum length B in any sliding window of length
T + 1 is that c_T > B.

We omit a justification of these results due to space constraints.

It follows from Facts 1 and 2 that a necessary and sufficient
condition for any convolutional code to recover each source
packet with a delay of T over a channel that introduces either
N arbitrary erasures or B consecutive erasures in a sliding
window of length T + 1 is that d_T > N and c_T > B.
Thus it is of interest to investigate code constructions that
simultaneously have a large column distance and a large
column span.

It turns out that large column distance and large column
span are conflicting requirements in general. The following
theorem provides an outer bound on the set of all achievable
pairs (c_T, d_T) for any code of a given rate.
Fig. 3. The periodic erasure channel used to prove an upper bound on
capacity in Theorem 1. Here A_1 = c_T − 1 and A_2 = T − d_T + 2.
The shaded symbols are erased while the remaining ones are received by the
destination.

Fig. 4. One period of the periodic erasure channel in Fig. 3.

Theorem 1 (Column-Distance and Column-Span Tradeoff):
Any rate R convolutional code with a column distance of
d_T and a column span of c_T must satisfy

    \frac{R}{1-R}\, c_T + d_T \le T + 1 + \frac{1}{1-R},   (5)

as well as d_T ≤ c_T and c_T ≤ T + 1.

Proof: We consider a periodic erasure channel with a
period of P = T + c_T − d_T + 1 and suppose that in every
such period the first B = c_T − 1 symbols are erased. We
claim that for any convolutional code with a column span and
column distance of c_T and d_T respectively, the decoder can
reconstruct every source packet from such an erased sequence.

Consider the first period that spans the interval [0, P − 1].
The first c_T − d_T erased symbols all need to be recovered by
time t = P − 1. Thus in the window of interest, these symbols
only experience a single erasure burst of length c_T − 1 or
smaller. From Fact 2 these symbols can be recovered by any
code with a column span of c_T.

The next d_T − 1 symbols have a deadline after time P − 1.
To recover s[t] for t ∈ [c_T − d_T + 1, c_T − 1] observe that
the length T + 1 window W_t = [t, t + T] has two erasure
bursts, one at the start and one at the end of the interval.
As shown in Fig. 4 each such interval has a total of T − d_T + 2
non-erased symbols. Thus the total number of erased symbols
equals T + 1 − (T − d_T + 2) = d_T − 1. From Fact 1 a code
with a column distance of d_T can recover all of these symbols.

Finally for t ∈ [d_T, P − 1], the recovery window W_t =
[t, t + T] only sees a single erasure burst of length c_T − 1 and
hence the column span of c_T suffices to recover these symbols.

Having recovered all the symbols in [0, P − 1] by their
deadline, we can cancel their effect in all future parity checks
and repeat the same argument for every other period. Thus we
can recover all erased symbols, and the rate of the code is
upper bounded by the capacity of the periodic erasure channel,
which results in

    R \le 1 - \frac{c_T - 1}{T + c_T - d_T + 1}.   (6)

Rearranging, this equation reduces to (5). The upper bound
d_T ≤ c_T follows by observing that a code that corrects
d_T − 1 arbitrary erasures in a sliding window of length T + 1
trivially corrects an erasure burst of the same length. The
bound c_T ≤ T + 1 simply follows from the definition. ■

Fig. 5. A window of T + 1 channel packets showing the code construction of
Streaming Codes (SCo). v denotes the set of symbols (v[t − T], ..., v[t]).

Fig. 6. A window of T + 1 channel packets showing the code construction
of Embedded Random Linear Codes (E-RLC).
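For completeness, the rearrangement that takes the capacity bound (6) to the tradeoff (5) at the end of the proof of Theorem 1 can be written out as follows. This is only the algebra implied by the proof; it uses 0 < R < 1 and the fact that the period T + c_T − d_T + 1 is positive because d_T ≤ c_T.

```latex
\begin{align*}
R \le 1 - \frac{c_T - 1}{T + c_T - d_T + 1}
  &\iff R\,(T + c_T - d_T + 1) \le T - d_T + 2 \\
  &\iff R\,c_T + (1-R)\,d_T \le (1-R)(T+1) + 1 \\
  &\iff \frac{R}{1-R}\,c_T + d_T \le T + 1 + \frac{1}{1-R}.
\end{align*}
```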
Remark 2: Substituting R = 1/2, the expression in (5) reduces
to the following upper bound

    c_T + d_T \le T + 3.   (7)

We conclude that the R = 1/2 codes found via a computer
search in [3, Section V-B] are indeed optimal as they all
satisfy (7) with equality. We next propose a family of codes that meet the
upper bound (7) when R = 1/2.
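The outer bound of Theorem 1 is easy to evaluate numerically. The helper functions below are a hypothetical sketch (their names are not from the paper) that uses exact rational arithmetic to test whether a pair (c_T, d_T) is permitted, and to find the largest column distance allowed for a given column span.

```python
import math
from fractions import Fraction


def tradeoff_slack(R, T, c_T, d_T):
    """Right side minus left side of the tradeoff bound; >= 0 iff it holds."""
    R = Fraction(R)
    lhs = R / (1 - R) * c_T + d_T
    rhs = T + 1 + 1 / (1 - R)
    return rhs - lhs


def is_feasible(R, T, c_T, d_T):
    """Check the tradeoff bound together with d_T <= c_T <= T + 1."""
    return tradeoff_slack(R, T, c_T, d_T) >= 0 and d_T <= c_T <= T + 1


def max_column_distance(R, T, c_T):
    """Largest integer d_T that Theorem 1 permits for a given c_T."""
    R = Fraction(R)
    bound = T + 1 + 1 / (1 - R) - R / (1 - R) * c_T
    return min(c_T, math.floor(bound))


half = Fraction(1, 2)
for c_T in (11, 12, 13):
    print(c_T, max_column_distance(half, 12, c_T))
```

For R = 1/2 and T = 12 the loop reports d_T = 4, 3, 2 for c_T = 11, 12, 13, so each pair sums to T + 3 = 15 as in the rate-1/2 bound.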

IV. Embedded Random Linear Codes 

We introduce a construction that provides a flexible tradeoff
between the column distance and column span discussed in
Section III. This family includes codes with maximum column
distance and maximum column span as special cases. Hence
we discuss these special cases first.

A. Maximum Column-Distance Codes

As stated in Theorem 1 we always have that c_T ≥ d_T. For
the maximum column distance, we set c_T = d_T in (5), which gives

    d_T \le 1 + (1-R)(T+1).   (8)

The upper bound is the Singleton-bound equivalent for convolutional
codes [18, Chapter 3]. The upper bound is achieved
whenever the generator matrix in (4) has a full rank property,
i.e., any set of k(T + 1) columns are linearly independent.
By selecting the entries of the generator matrices from a sufficiently large finite
field, we can satisfy this property with high probability. We
will refer to this construction as a Random Linear Code (RLC).

B. Maximum Column-Span Codes 

Clearly any convolutional code with a column span of c_T ≥
2 is guaranteed to have d_T ≥ 2. The latter simply implies that
at least one erasure can be corrected in a window of length
T + 1. Substituting d_T = 2 in (5) and using c_T ≤ T + 1,

    c_T \le 1 + T \cdot \min\left( \frac{1}{R} - 1,\; 1 \right).   (9)

A class of codes, SCo, with this property is constructed
in [2], [3]. Due to space constraints we do not review the code
construction but refer the reader to [2], [3]. Instead, we describe
a related construction that also achieves the maximum column
span. The advantage of this construction is that it generalizes
to constructions that simultaneously have large column span
and column distance. This construction is illustrated in Fig. 5
and the main steps are as described below.
Encoding:

1) Split each source symbol into a total of T sub-symbols
over F_q, belonging to two groups as shown below.

    s[i] = ( \underbrace{u_0[i], \ldots, u_{B-1}[i]}_{= u[i]},\; \underbrace{v_0[i], \ldots, v_{T-B-1}[i]}_{= v[i]} )   (10)

2) Apply a (T, T − B) systematic random linear code to
the source symbols v[i] and generate B parity checks
p_v[i] = (p_0[i], ..., p_{B−1}[i]) at time i, i.e.,

    p_v[i] = \sum_{j=1}^{T-1} v[i-j] \cdot G_j,   (11)

where G_j ∈ F_q^{(T−B)×B}. It can be verified from (8) that
such a code can recover up to B erasures in a window
of length T.

3) Apply a repetition code to u[i] with a delay of T and
then combine it with the parity checks p_v[i], i.e.,

    x[i] = \begin{pmatrix} u[i] \\ v[i] \\ p_v[i] \oplus u[i-T] \end{pmatrix}.   (12)

Suppose that an erasure burst spans t ∈ [0, B − 1] (c.f. Fig. 5).
The receiver needs to recover s[j] by time j + T for j ∈
{0, ..., B − 1}. Our proposed decoder uses the parity checks
of the random linear code to first recover all the symbols
v[j] simultaneously by time T − 1. Having recovered these
symbols the decoder sequentially recovers the symbols u[j]
at time j + T using the repetition code. More specifically the
decoder implements the following steps.
Decoding:

• Recover the parity check symbols p_v[B], ..., p_v[T − 1]
from x[B], ..., x[T − 1] by cancelling the symbols u[t]
for t < 0 that are not erased.

• Recover the symbols v[0], ..., v[B − 1] from the parity
checks p_v[B], ..., p_v[T − 1] using the random linear
code (11).

• For j ∈ [0, B − 1], at time j + T, first compute the parity
check p_v[j + T] which is a function of symbols v[i] that
have been recovered already and then subtract it from
u[j] + p_v[j + T] to recover u[j]. Thus the source symbol
s[j] = (u[j], v[j]) is recovered by time j + T although
the symbol v[j] is recovered by time T − 1.

• All the erased symbols are recovered by time t = T + B −
1. The decoder can recover from a second erasure burst
starting at time t = T + B or later. This is equivalent to
the condition that c_T = B + 1.
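The encoding and decoding steps above can be exercised end to end on a small instance. The sketch below is an illustration under stated assumptions, not the authors' implementation: it works over the prime field GF(P) with P = 2^31 − 1 standing in for a "sufficiently large" q, draws the matrices G_j at random, assumes all source symbols before time 0 are zero, and only handles a single burst that erases x[0], ..., x[B − 1]. All function names are hypothetical.

```python
import random

P = 2_147_483_647  # assumed prime field size standing in for a large q


def vec_mat(vec, mat):
    """Row vector times matrix over GF(P)."""
    return [sum(a * row[c] for a, row in zip(vec, mat)) % P
            for c in range(len(mat[0]))]


def solve(A, b):
    """Solve z A = b over GF(P) by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    # Row c of M holds the coefficients of equation c followed by its right side.
    M = [[A[r][c] for r in range(n)] + [b[c]] for c in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col])
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, P)
        M[col] = [x * inv % P for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % P for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def parity_v(v, i, G, T):
    """p_v[i] = sum_{j=1}^{T-1} v[i-j] G_j, taking v[t] = 0 for t < 0."""
    p = [0] * len(G[1][0])
    for j in range(1, T):
        if i - j >= 0:
            p = [(x + y) % P for x, y in zip(p, vec_mat(v[i - j], G[j]))]
    return p


def encode(u, v, G, T):
    """x[i] = (u[i], v[i], p_v[i] + u[i-T]), taking u[t] = 0 for t < 0."""
    x = []
    for i in range(len(u)):
        p = parity_v(v, i, G, T)
        if i >= T:
            p = [(a + c) % P for a, c in zip(p, u[i - T])]
        x.append((u[i], v[i], p))
    return x


def decode_burst(x, G, T, B):
    """Recover s[j] = (u[j], v[j]), j < B, after a burst erasing x[0..B-1]."""
    n_v = T - B
    v = {t: x[t][1] for t in range(B, len(x))}  # v[t] from received packets
    # Unknowns: v[0..B-1]. Equations: parity checks p_v[B..T-1], which carry
    # no u[.] term here because u[t] = 0 for t < 0.
    A = [[G[i - t][a][c] for i in range(B, T) for c in range(B)]
         for t in range(B) for a in range(n_v)]
    b = []
    for i in range(B, T):
        known = [0] * B
        for t in range(B, i):
            known = [(k + y) % P for k, y in zip(known, vec_mat(v[t], G[i - t]))]
        b += [(p - k) % P for p, k in zip(x[i][2], known)]
    z = solve(A, b)
    for t in range(B):
        v[t] = z[t * n_v:(t + 1) * n_v]
    # At time j + T, subtract the now-computable p_v[j+T] to expose u[j].
    out = []
    for j in range(B):
        p = parity_v(v, j + T, G, T)
        u_j = [(a - c) % P for a, c in zip(x[j + T][2], p)]
        out.append((u_j, v[j]))
    return out


random.seed(0)
T, B = 4, 2
G = {j: [[random.randrange(1, P) for _ in range(B)] for _ in range(T - B)]
     for j in range(1, T)}
u = [[random.randrange(P) for _ in range(B)] for _ in range(T + B)]
v = [[random.randrange(P) for _ in range(T - B)] for _ in range(T + B)]
x = encode(u, v, G, T)
received = [None] * B + x[B:]  # burst erasure over t = 0, ..., B-1
recovered = decode_burst(received, G, T, B)
print(recovered == [(u[j], v[j]) for j in range(B)])
```

The decoder only reads x[B], ..., x[T − 1] to solve for the v[·] symbols and reads x[j + T] to obtain u[j], which mirrors the delay structure described in the steps above. Decoding succeeds whenever the randomly drawn linear system is invertible, which fails only with probability on the order of 1/P.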

Notice that the proposed construction takes an RLC code
over v[·] as a base code and embeds additional symbols u[·].
The parity checks of u[·] are simple repetition codes and are
directly combined with p_v[·] after a shift of T. Thus the rate
increases over the base RLC code upon addition of u[·]. In the
generalization of this construction we replace the repetition
code with another RLC code.

C. Proposed Construction 

The use of a repetition code in the previous section limits
the column distance to d_T = 2. To improve the column
distance we first replace the repetition code for u[·] with
another random linear code. Furthermore instead of applying
a shift of T to the parity checks of the u[·] symbols we apply
a shift of Δ ≤ T. In particular we construct the parity checks
p_v[i] as in (11) and construct a second set of parity checks

    p_u[i] = \sum_{j=0}^{T-\Delta} u[i-j]\, H_j,   (13)

which are combined with p_v[i] after a shift of Δ in the transmitted symbol

    x[i] = \begin{pmatrix} u[i] \\ v[i] \\ p_v[i] + p_u[i-\Delta] \end{pmatrix}.   (14)

We will assume that u[i] ∈ F_q^u and v[i] ∈ F_q^v and that the parity
checks satisfy p_v[i], p_u[i] ∈ F_q^u.

We will assume that the entries in the matrices H_j and G_j
are all sampled uniformly at random and q is sufficiently large
so that all the sub-matrices of interest have either full row-rank
or column-rank with high probability. The code construction
is illustrated in Fig. 6. The rate of the code is given by

    R = \frac{u+v}{2u+v}.   (15)

We develop closed form expressions for the column span 
and column distance of the proposed code construction below. 

Proposition 1: The column span of the Embedded-Random
Linear Code with a shift of Δ is given by

    c_T = \begin{cases} \frac{1-R}{R}\,\Delta + 1, & R \le \frac{\Delta}{T+1}, \\ (1-R)(T+1) + 1, & R \ge \frac{\Delta}{T+1}. \end{cases}   (16)



Proof: To compute the column span it is sufficient to
find the largest erasure burst length B starting at time t = 0,
such that s[0] can be recovered by time t = T. Note that the
parity checks p_u[·] involving u[0], ..., u[B − 1] appear from
time t = Δ, ..., Δ + B − 1. The parity checks in the interval
[B, Δ − 1] do not involve any u[·] symbols that are erased.

We first find the condition under which the parity checks
in the interval [B, Δ − 1] can be used to recover all the v[·]
symbols in time [0, B − 1]. Since there are a total of Δ − B
parity check symbols, each contributing u equations, and a total
of B erased symbols, each generating v unknowns, we must
have that

    B \cdot v \le (\Delta - B)\, u,   (17)

which implies from (15), where u/(u + v) = (1 − R)/R, that
B ≤ ((1 − R)/R) Δ. Once all the
erased v[·] symbols are recovered, their contribution can be
cancelled from future parity checks and the symbol u[0] can
be recovered at time Δ ≤ T.

If (17) is not satisfied then all the u[·] and v[·] symbols
need to be simultaneously recovered at time t = T. There are
a total of T + 1 − B non-erased parity check symbols in the
interval [B, T] and each parity check contributes u equations.
The total number of unknowns from the B erased symbols is
B(u + v). Thus we must have that

    B(u + v) \le (T + 1 - B)\, u,   (18)

which leads to B ≤ (1 − R)(T + 1). Thus it follows that the
maximum burst length that can be corrected is given by

    B = \max\left\{ (1-R)(T+1),\; \frac{1-R}{R}\,\Delta \right\},   (19)

from which the claim easily follows. ■
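The counting argument in this proof can be cross-checked numerically. The sketch below is illustrative and makes two assumptions beyond the text: the burst length is restricted to integers, and the two counting conditions are taken in the non-strict form B·v ≤ (shift − B)·u and B·(u + v) ≤ (T + 1 − B)·u, with the rate given by R = (u + v)/(2u + v). The function names are hypothetical.

```python
from fractions import Fraction


def max_burst_by_counting(u, v, T, shift):
    """Largest integer burst length B that satisfies either counting condition."""
    best = 0
    for B in range(1, T + 1):
        early = B * v <= (shift - B) * u          # v[.] recovered before the shift
        joint = B * (u + v) <= (T + 1 - B) * u    # everything recovered at time T
        if early or joint:
            best = B
    return best


def max_burst_closed_form(u, v, T, shift):
    """Floor of max{(1-R)(T+1), ((1-R)/R) * shift} with R = (u+v)/(2u+v)."""
    R = Fraction(u + v, 2 * u + v)
    bound = max((1 - R) * (T + 1), (1 - R) / R * shift)
    return bound.numerator // bound.denominator


# Example: u = v = 1 gives R = 2/3; with T = 12 and a shift of 10 both agree.
print(max_burst_by_counting(1, 1, 12, 10), max_burst_closed_form(1, 1, 12, 10))
```

For u = v = 1, T = 12 and a shift of 10 both functions return 5, since the first condition allows B up to 5 while the second allows only B up to 4. By Fact 2 the corresponding column span is one larger than the largest correctable burst.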
Remark 3: Our result in Prop. 1 shows that to improve the
column span over a random linear code one must take the shift
to satisfy Δ > R(T + 1).

Proposition 2: The column distance of the Embedded-
Random Linear Code with a shift of Δ ≥ R(T + 1) and
R ≥ 1/2 is given by the following

    d_T = \frac{1-R}{R}\,(T - \Delta) + 2.   (20)



Proof:

We need to show that for any erasure sequence in the
window [0, T] if the symbol s[0] is not recovered by time
t = T then the number of erasures must be at least d_T.

We first observe that if the symbol v[0] is not recovered by
time t = Δ − 1 then the code behaves like a random linear
code. Every parity check sub-symbol provides one independent
equation. The total number of erasures necessary is given
by the column distance of the random linear code (8), which
is the maximum possible column distance and exceeds (20).

Thus we only need to consider those erasure patterns where
v[0] is recovered by time t = Δ − 1. In addition to s[0] we
consider three groups of symbols. Group 1 consists of k_1
symbols that are erased in time t ∈ [1, Δ − 1] such that
the corresponding v[·] is recovered by time t = Δ − 1. Group
2 consists of k_2 symbols erased in the same interval whose
v[·] symbols are not recovered by time t = Δ − 1. Group 3
consists of k_3 symbols erased in the time t ∈ [Δ, T]. We
seek the minimum possible value of k_1 + k_2 + k_3 such that
the symbol u[0] is not recovered by time t = T.

Since R ≥ 1/2 and Δ ≥ R(T + 1) we have that 2Δ ≥ T + 1.
Thus the u[·] symbols of group 3 are involved in parity checks
after time t = T and hence do not need to be considered.

We consider a possibly sub-optimal decoder that attempts
to recover the remaining symbols using only the parity checks
in the interval t ∈ [Δ, T]. Clearly, by using such a sub-optimal
decoder we can only under-estimate the number of erasures
that can be corrected. Since the symbol u[0] starts appearing in
the parity checks at time t = Δ (c.f. (14)) and is not
recovered by time t = T (by assumption), each of the parity
check sub-symbols in the interval [Δ, T] provides one non-redundant
equation. Thus it follows that the total number of
unknowns associated with the remaining erased symbols must
exceed the number of available parity check equations. The
total number of unknowns is upper bounded by a sum of three
terms:

• The u[·] symbols in group 1 and s[0]: N_1 = (k_1 + 1) u

• Both u[·] and v[·] symbols in group 2: N_2 = k_2 (u + v)

• The v[·] symbols in group 3: N_3 = k_3 · v

The total number of available equations from the parity checks
in the interval [Δ, T] where there are k_3 erasures is given by
(T − Δ − k_3 + 1) u. Thus a necessary condition under which
u[0] is not recovered is given by:

    N_1 + N_2 + N_3 > (T - \Delta - k_3 + 1)\, u.   (21)

Upon substituting for N_i and through some simple algebra we
get that

    (k_1 + k_2 + k_3)\, u + (k_2 + k_3)\, v > (T - \Delta)\, u,   (22)

which in turn implies that

    k_1 + k_2 + k_3 > \frac{u}{u+v}\,(T - \Delta) = \frac{1-R}{R}\,(T - \Delta).   (23)

Thus the total number of erasures in any such sequence must
exceed 1 + ((1 − R)/R)(T − Δ) as stated in (20). ■

Remark 4: For the special case of R = 1/2 and Δ ≥ (T + 1)/2,
note from Prop. 1 that c_T = Δ + 1. From Prop. 2 we
have that d_T = T − Δ + 2. Thus we have that d_T + c_T =
T + 3, which meets the upper bound in (7). Thus the proposed
embedded-random linear code constructions provide a family
of codes that are optimal for R = 1/2. As discussed before, the
embedded-random linear codes are also optimal in the special
case of maximum column span or maximum column distance.
Their optimality in other cases remains to be seen. The gaps
between the upper bound given in (7) and the values achieved by
Embedded-RLC codes are illustrated in Fig. 7.
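The rate-1/2 optimality claim can be tabulated directly. The sketch below assumes the closed forms c_T = ((1 − R)/R)·shift + 1 when shift ≥ R(T + 1), c_T = (1 − R)(T + 1) + 1 otherwise, and d_T = ((1 − R)/R)(T − shift) + 2 for shift ≥ R(T + 1) and R ≥ 1/2; it applies no integer rounding, which the closed forms do not address. The function names are hypothetical.

```python
from fractions import Fraction


def column_span(R, T, shift):
    """Closed-form column span for a given shift (no integer rounding)."""
    R = Fraction(R)
    if shift >= R * (T + 1):
        return (1 - R) / R * shift + 1
    return (1 - R) * (T + 1) + 1


def column_distance(R, T, shift):
    """Closed-form column distance, valid for shift >= R (T + 1) and R >= 1/2."""
    R = Fraction(R)
    if R < Fraction(1, 2) or shift < R * (T + 1):
        raise ValueError("outside the range covered by the closed form")
    return (1 - R) / R * (T - shift) + 2


T = 12
half = Fraction(1, 2)
for shift in range(7, T + 1):
    c_T = column_span(half, T, shift)
    d_T = column_distance(half, T, shift)
    print(shift, c_T, d_T, c_T + d_T == T + 3)
```

Every admissible shift at R = 1/2 gives c_T + d_T = T + 3. The largest shift, equal to T, yields the maximum column span T + 1 with a column distance of 2, and decreasing the shift trades span for distance one unit at a time.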

V. Simulation Results - Gilbert-Elliott Channel Model

We consider a two-state Gilbert-Elliott channel model [19],
[20]. In the "good state" each channel packet is lost with a
probability of ε whereas in the "bad state" each channel packet
is lost with a probability of 1. We note that the average loss
rate of the Gilbert-Elliott channel is given by

    \Pr(E) = \frac{\beta}{\beta+\alpha}\,\varepsilon + \frac{\alpha}{\alpha+\beta},   (24)

where α and β denote the transition probability from the good
state to the bad state and vice versa.

As long as the channel stays in the bad state the channel
behaves as a burst-erasure channel. The length of each burst
is a geometric random variable with a mean of 1/β. When the
channel is in the good state it behaves as an i.i.d. erasure
channel with an erasure probability of ε. The gap between two
successive bursts is also a geometric random variable with a
mean of 1/α.

Fig. 8 and Fig. 10 show the simulation performance over
a Gilbert-Elliott channel. The parameters chosen in the two
plots are as shown in Table I.
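A minimal sketch of such a two-state channel is given below. It is not the simulation code used for the reported results: the function names, the choice to start in the good state, and the convention that the state transition happens after each packet are all assumptions made for illustration.

```python
import random


def average_loss_rate(alpha, beta, eps):
    """Stationary loss rate: good-state share times eps plus bad-state share."""
    return beta / (alpha + beta) * eps + alpha / (alpha + beta)


def simulate_gilbert_elliott(n, alpha, beta, eps, seed=None):
    """Return n erasure flags (True = lost), starting in the good state."""
    rng = random.Random(seed)
    bad = False
    losses = []
    for _ in range(n):
        # Bad state: always lost. Good state: lost with probability eps.
        losses.append(True if bad else rng.random() < eps)
        # Good -> bad with probability alpha, bad -> good with probability beta.
        if bad:
            bad = rng.random() >= beta
        else:
            bad = rng.random() < alpha
    return losses


flags = simulate_gilbert_elliott(200_000, alpha=5e-4, beta=0.5, eps=1e-3, seed=1)
print(sum(flags) / len(flags), average_loss_rate(5e-4, 0.5, 1e-3))
```

Because the bad state is left with probability β after every packet, burst lengths in this sketch are geometric with mean 1/β, and the empirical loss fraction should approach the stationary expression as the number of packets grows.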
TABLE I

Gilbert-Elliott Channel Parameters 



We note that the channel parameters for the T = 12 case
are the same as those used in [3, Section 4-B, Fig. 5]. For
the case when ε = 0 we have verified that our simulations
agree with the results in [3]. Note that we do not use R = 0.5,
because the SCo codes degenerate into simple repetition codes
for this case [3]. We use the next highest rate for each class
of codes. The choice of β is smaller for T = 50 because
we expect to be able to correct longer bursts because of the
larger delay. The histogram of burst lengths for both channels
is shown in Fig. 9 and Fig. 11. The choice of α is taken to be
sufficiently small so that the contribution from failures due to
small guard periods between bursts is not dominant. Note that
our proposed constructions degenerate to RLC codes when the
inter-burst gaps are smaller than the decoding delay and hence
the performance gains are not observed in that regime.
TABLE II

Column Distance and Span for Embedded-RLC Codes for Some
Rates R and Delays T.



Fig. 8. Simulation over a Gilbert-Elliott Channel with (α, β) = (5 ×
10^{-4}, 0.5). All codes are evaluated using a decoding delay of T = 12 symbols.

Fig. 9. Histogram of burst lengths when β = 0.5, which approximates a geometric
distribution (shown dotted) with success probability of 0.5.

Fig. 10. Simulation over a Gilbert-Elliott Channel with (α, β) = (10^{-5}, 0.1).
All codes are evaluated using a decoding delay of T = 50 symbols.

Fig. 11. Histogram of burst lengths when β = 0.1, which approximates a geometric
distribution (shown dotted) with the same success probability.

In Fig. 8 and Fig. 10 we observe that our Embedded-RLC
constructions provide improved error correction capability
over both the SCo codes and RLC codes by virtue of their
longer column span and column distance. We discuss the
performance of various codes in more detail below.

• Uncoded Loss Rate: The uppermost plot in Fig. [8] and 
Fig. [TO] is the uncoded packet loss rate. It agrees well 
with the expression in ( |24) , 

• Random Linear Codes: The solid horizontal black line 
is the loss-rate of the Random Linear Code (RLC) in 
Section IIV-AI which has the maximum column distance. 
We see that for the range of e that we consider the RLC is 
able to correct all the erasures in the good state and hence 
the loss rate does not depend on e. The only losses that 
occur are when the burst-lengths in the bad state exceed 
B = 6 in Fig. [8] and B = 25 in Fig. 10 The loss rates 
of Pr(£) k, 4 x 10~ 5 and Pr(£ ) « lO^observed in the 
two cases are consistent with the probability of observing 
such long bursts. 

• Streaming Codes: The SCo Codes are represented by the 
red plot. We see that in the interval of e considered, there 
is a noticeable increase in the loss rate. The performance 
is better than RLC codes for e rj 10~ 3 but deteriorates 
quickly as we increase e. The packet-loss probability 
increases in proportion to e 2 as dy = 2 for these codes. 

• Embedded-Random Linear Codes: The column distance 
and column span of these codes, obtained from 
Propositions 1 and 2, are indicated in Table II. For the 
T = 12 case, the performance of the Embedded-Random 
Linear Codes with shifts of Δ ∈ {10, 11} is shown in Fig. 8. 
The shift of Δ = 11 also has a column distance of 2. It 
follows a similar trend as the SCo codes and its performance 
deteriorates quickly with ε. The shift of Δ = 10 provides 
the best performance in Fig. 8. This code has a column 
distance of d_T = 3. We observe that the effect of i.i.d. 
erasures is not significant for most of the interval of ε 
considered, as the loss rate scales as ε^3. These codes do 
have a smaller column span than the SCo codes and hence 
their performance is slightly worse at the other extreme 
of ε ≈ 10^{-3}. For the T = 50 case, the Embedded-RLC 
codes with shifts of Δ ∈ {36, 44} are shown in Fig. 10. 
Since these codes have a column distance of at least 7, 
the performance does not deteriorate as noticeably as the 
SCo codes in the range of ε of interest. The shift of 44 
has the best performance because it has a longer column 
span and hence can correct longer erasure bursts. 
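
The channel behaviour behind these plots can be reproduced with a short simulation. The following Python sketch is illustrative only and is not the simulation code used for the figures. It assumes a two-state chain in which the good state erases each symbol independently with probability ε, the bad state erases every symbol, and the transitions good-to-bad and bad-to-good occur with probabilities α and β. The function names and the parameter values are hypothetical. It collects the bad-state run lengths, which under these assumptions are geometric with success probability β, and compares the empirical loss rate with the stationary loss rate of this chain (which is not necessarily written in the same form as the expression in (24)).

```python
import random
from collections import Counter


def simulate_gilbert_elliott(n, alpha, beta, eps, seed=0):
    """Simulate n channel uses; return (erasure flags, bad-state run lengths)."""
    rng = random.Random(seed)
    in_bad = False
    erased, runs, run = [], [], 0
    for _ in range(n):
        if in_bad:
            erased.append(True)            # assumed: bad state always erases
            run += 1
            if rng.random() < beta:        # leave the bad state w.p. beta
                in_bad = False
                runs.append(run)
                run = 0
        else:
            erased.append(rng.random() < eps)  # good state: i.i.d. loss w.p. eps
            if rng.random() < alpha:       # enter the bad state w.p. alpha
                in_bad = True
    return erased, runs


def stationary_loss_rate(alpha, beta, eps):
    """Stationary uncoded loss rate of the two-state chain assumed above."""
    p_bad = alpha / (alpha + beta)
    return p_bad + (1.0 - p_bad) * eps


if __name__ == "__main__":
    alpha, beta, eps = 1e-2, 0.5, 1e-3     # illustrative values, not the paper's
    erased, runs = simulate_gilbert_elliott(200_000, alpha, beta, eps)
    hist = Counter(runs)
    for k in range(1, 6):
        geometric = beta * (1 - beta) ** (k - 1)
        print(k, hist[k] / len(runs), geometric)   # empirical vs. geometric pmf
    print(sum(erased) / len(erased), stationary_loss_rate(alpha, beta, eps))
```

A larger α than in the figures is used here only so that a short run already contains enough bursts for a stable histogram.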

VI. Simulation Results - Fritchman Channel Model 

In this section, we consider a special class of the Fritchman 
channel model [21] with a total of N + 1 states. One of 
the states is the error-free state and the remaining N states 
are error states. Fritchman and related higher-order Markov 
models are commonly used to model fade durations in mobile 
links. 

Fig. 12. The Fritchman Model with One Good State and N Error States. 
In each error state the packet is lost with probability 1 whereas in the good 
state it is lost with probability ε. 

We let the transition probability from the good state to the 
first error state E_1 be α whereas the transition probability 
from each of the error states equals β. Let ε be the probability 
of a packet loss in the good state. We lose packets in any error 
state with probability 1. We consider two scenarios in Fig. 13 
and Fig. 15, whose parameters are shown in Table III. 
TABLE III 
Fritchman Channel Parameters 



Fig. 14 and Fig. 16 illustrate the empirical histogram of 
burst lengths in a sample erasure pattern generated over a 
channel of 10^8 symbols. The actual distribution is given by 
a negative binomial distribution and is shown by the dotted 
envelope. 
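
The negative binomial envelope can be checked with a small sketch. The Python code below is illustrative and rests on an assumption that is consistent with, but not spelled out in, the text above: the N error states are visited in sequence (E_1, then E_2, and so on, then back to the good state), and each error state is left with probability β per symbol. Under that assumption a burst length is a sum of N geometric sojourn times. The function names and the values N = 3, β = 0.4 are hypothetical and are not the Table III parameters.

```python
import random
from math import comb


def burst_length_pmf(k, n_states, beta):
    """P(burst length = k) for a sum of n_states geometric(beta) sojourns."""
    if k < n_states:
        return 0.0                       # each error state lasts at least 1 symbol
    return comb(k - 1, n_states - 1) * beta ** n_states * (1 - beta) ** (k - n_states)


def sample_burst_length(n_states, beta, rng):
    """Draw one burst length by walking through the assumed chain of error states."""
    length = 0
    for _ in range(n_states):
        length += 1                      # first symbol spent in this error state
        while rng.random() >= beta:      # stay in the state w.p. 1 - beta
            length += 1
    return length


if __name__ == "__main__":
    n_states, beta = 3, 0.4              # illustrative values only
    rng = random.Random(0)
    samples = [sample_burst_length(n_states, beta, rng) for _ in range(50_000)]
    for k in range(n_states, n_states + 6):
        empirical = samples.count(k) / len(samples)
        print(k, empirical, burst_length_pmf(k, n_states, beta))
    print(sum(samples) / len(samples), n_states / beta)   # empirical vs. exact mean
```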

In both Fig. 13 and Fig. 15 the uncoded loss rate is shown 
by the uppermost plot while the black horizontal line is the 
performance of RLC. Note that the performance of RLC is 
essentially independent of ε in the interval of interest. As 
before, the RLC codes clean up all the losses in the good 
state and fail against burst lengths longer than their column 
span. The performance of the SCo codes is shown by the 
red plot in both figures. We note that it is better than the RLC 
code for ε ≈ 10^{-3} but deteriorates quickly as we increase ε. 
There are two dominant error events for SCo codes. One is 
the simultaneous erasure of the symbols in the repetition code. 
The second is the occurrence of an isolated erasure in the good 
state in the interval of length T following a transition from the 
bad state. This particular event is significant for larger values 
of T. 
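
A rough calculation suggests why the second event grows with T. As a simplifying assumption that is not part of the analysis above, suppose the channel stays in the good state for the T symbols after the burst and erases each of them independently with probability ε. The values in the comment are illustrative only.

```latex
\Pr(\text{at least one isolated erasure in } T \text{ symbols})
  = 1 - (1 - \epsilon)^{T} \approx T\epsilon \qquad (T\epsilon \ll 1)
% For \epsilon = 10^{-3}: T = 12 gives about 0.012, T = 50 gives about 0.049
```

Under this simplification the probability grows roughly linearly in T.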

The parameters of the embedded-RLC codes used in these 
figures are shown in Table II. In Fig. 13 we observe that the 
shift of Δ = 32 has the smallest loss rate over the interval 
of ε of interest. The longest burst length observed in Fig. 14 
is B = 30, which can be recovered by this shift, and the 
relatively larger column distance of d_T = 9 makes it more 
resilient than the shift of Δ = 36. In Fig. 15 we observe that 
the shift of length Δ = 60 performs best for ε < 4 × 10^{-3} 
whereas the shift of length Δ = 52 performs best for ε > 
4 × 10^{-3}. The relatively larger column span of the former 
helps for small values of ε whereas the relatively larger column 
distance of the latter helps for larger values of ε. For ε ≈ 
3 × 10^{-3}, the RLC codes achieve a loss probability of ≈ 4 × 10^{-5}, 
the SCo codes achieve ≈ 6 × 10^{-5} whereas the proposed 
constructions achieve ≈ 6 × 10^{-6}. More generally, over the 
entire range of ε, the best embedded-RLC codes achieve a 
loss rate which is a factor of 10 or more smaller than the SCo 
code and a factor of 3 to over 10 smaller than the 
RLC code. 

VII. Conclusion 

We study the construction of low-delay codes for streaming 
data over channels that introduce both isolated and burst 
packet losses. We show that good code constructions for 
such channels should simultaneously have large column span 
and column distance. We establish, to our knowledge, the 
first outer bound on the achievable column-span and 
column-distance tradeoff for any convolutional code of a given rate. 
This allows us to establish that some of the code constructions 
previously obtained from a computer search are indeed 
optimal. We propose a new class of codes, embedded-random 
linear codes, that divide each source packet into two groups 
of symbols, perform unequal error protection and combine the 
resulting parity checks with a suitable shift. We develop 
closed-form expressions for the column distance and column span of 
these codes and demonstrate how the code parameters can be 
tuned to obtain a flexible tradeoff between the column distance 
and column span. Our proposed code constructions achieve the 
outer bound for rate R = 1/2 and also reduce to known 
constructions such as the random linear codes and 
burst-erasure codes at the extreme points. Numerical simulations 
on the Gilbert-Elliott channel and Fritchman channel indeed 
show significant performance gains over previously proposed 
constructions. 

In terms of future work, it will be interesting to investigate 
optimal code constructions for rates other than R = 1/2. While 
our proposed construction in this paper splits each source 
packet into two groups, it remains to be seen whether more 
groups are needed in general. It might also be interesting to 
see if the outer bound on the column-distance and column-span 
tradeoff can be tightened for certain rate values. Extending 
these results to systems involving more than one 
communication link is also of great importance. Finally, experimental 
results over real packet loss traces will naturally provide 
a more realistic assessment of the performance gains from our 
delay-optimized code constructions. 